Implement the emitter

In this part of the assignment, we will implement an x86-64 emitter for our language.

This part is largely open-ended, with some guidance below. The class materials can also be helpful!

Overview

The emitter will take as input the AST and produce as output a string that contains equivalent x86-64 instructions, in AT&T syntax.

A program in our language can do one of two things:

Print a single string literal. In this case, the program should print the string and call the exit system call with an argument of 0.
Evaluate a mathematical expression. In this case, the program should call the exit system call with an argument whose value is the result of evaluating the expression.

Implementation guidance

Printing

For printing, use the system call for writing to standard out, along with a .data section that defines the string literal.
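As a rough illustration of the printing case, the sketch below builds the instructions and the data section as lists of strings. It assumes the Linux x86-64 system call numbers (1 for write, 60 for exit) and the symbol names msg and len. The function names emit_print_instructions and emit_data_section are hypothetical, and the playground or class materials may use different conventions.

```python
from typing import List


def emit_print_instructions() -> List[str]:
    """Instructions that write `len` bytes starting at `msg`, then exit with 0."""
    return [
        "mov $1, %rax",    # assumed: Linux system call number for write
        "mov $1, %rdi",    # file descriptor 1 is standard out
        "mov $msg, %rsi",  # address of the string bytes (symbol as immediate)
        "mov $len, %rdx",  # number of bytes to write (symbol as immediate)
        "syscall",
        "mov $60, %rax",   # assumed: Linux system call number for exit
        "mov $0, %rdi",    # exit status 0
        "syscall",
    ]


def emit_data_section(literal: str) -> List[str]:
    """Data section for `literal`, given with its quotation marks (e.g. '"hi"')."""
    return [".data", "msg:", f"  .ascii {literal}", "  len = . - msg"]


if __name__ == "__main__":
    print("\n".join(emit_print_instructions() + [""] + emit_data_section('"hello"')))
```

Keeping the instructions and the data section separate makes it easy to omit the data section for programs that do not print.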

Expression evaluation

For evaluating the expression, generate x86 instructions that evaluate each subexpression. The following algorithm turns an AST into a list of instructions that evaluate the same expression as the AST.

There are opportunities to optimize this algorithm; see "Where to go from here?".

If the AST is a single integer literal, emit an instruction that assigns that value to register \(r_1\).
If the AST is an arithmetic operation:
1. Recursively emit the left subtree of the operation. The result should then be in \(r_1\).
2. Emit an instruction that pushes \(r_1\) onto the stack.
3. Recursively emit the right subtree of the operation. The result should then be in \(r_1\).
4. Emit an instruction that pushes \(r_1\) onto the stack.
5. Emit an instruction that pops from the stack into \(r_2\).
6. Emit an instruction that pops from the stack into \(r_1\).
7. Emit an instruction that performs the operation, leaving the result in \(r_1\).
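The steps above can be sketched as a short recursive function. This is a minimal sketch, not the required design. The AST classes Num and BinOp are hypothetical stand-ins for whatever the parser produces. It assumes \(r_1\) is %rax and \(r_2\) is %rbx, and it covers only +, -, and * (division needs extra setup on x86-64 and is left out). The exit sequence assumes the Linux x86-64 system call number 60.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str  # one of "+", "-", "*"
    left: "Expr"
    right: "Expr"


Expr = Union[Num, BinOp]

# Assumed mapping from source operators to x86-64 mnemonics.
OPCODES = {"+": "add", "-": "sub", "*": "imul"}


def emit_expr(node: Expr) -> List[str]:
    """Emit instructions that leave the value of `node` in %rax (r1)."""
    if isinstance(node, Num):
        return [f"mov ${node.value}, %rax"]
    if isinstance(node, BinOp):
        return (
            emit_expr(node.left)        # step 1: left result in %rax
            + ["push %rax"]             # step 2
            + emit_expr(node.right)     # step 3: right result in %rax
            + [
                "push %rax",            # step 4
                "pop %rbx",             # step 5: right operand in %rbx (r2)
                "pop %rax",             # step 6: left operand in %rax (r1)
                # step 7: AT&T order is "op src, dst", so this computes
                # %rax = %rax op %rbx, i.e. left op right.
                f"{OPCODES[node.op]} %rbx, %rax",
            ]
        )
    raise TypeError(f"unexpected AST node: {node!r}")


def emit_exit_with_result() -> List[str]:
    """Exit with the value in %rax as the status (assumed Linux exit = 60)."""
    return ["mov %rax, %rdi", "mov $60, %rax", "syscall"]


if __name__ == "__main__":
    ast = BinOp("+", Num(1), BinOp("*", Num(2), Num(3)))
    print("\n".join(emit_expr(ast) + emit_exit_with_result()))
```

Operand order matters for subtraction: because the left operand is popped into %rax and the right into %rbx, `sub %rbx, %rax` yields left minus right.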

Assembler directives

To produce a complete x86 program that can run on a machine, we will need a few other pieces of assembly. For these pieces, we will use the syntax of a particular assembler, the GNU assembler (as).

The pieces consist of directives (commands to the assembler) and symbols (named values). A label is a special kind of symbol that provides a name for a location in the program. Note that x86 instructions can use symbols as immediate values.

.global: The .global directive makes a symbol visible to the linker.
_start:: This label should appear before the first instruction in our program. By default, the linker expects the entry point to our program to be labelled with _start.
.data: The .data directive tells the assembler that the next section of the file will define data that the program uses. If our program prints a string literal, the contents of the string literal will go in this section.
.ascii <string-literal>: The .ascii directive describes a (non-nul-terminated) string literal.
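len = . - <label>: This statement assigns the length of the string literal to a symbol named len, by computing the offset between the current address (.) and the address of a specific <label> (e.g., _start or msg). For the offset to equal the string's length, the statement must come immediately after the .ascii directive, and <label> must mark the start of the string.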

Putting it all together, our full program will have the following shape. Note that we do not need a data section if the program does not print anything.

.global _start

.text
_start:

  ...<instructions from the emitter>...

.data
msg:
  .ascii  <string-literal>
  len =   . - msg

where <string-literal> is the string literal from our source program, including the quotation marks.
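One possible way to produce this layout is a small helper that wraps the emitter's instructions in the fixed directives and appends the data section only when there is a string to print. The name assemble_program and its parameters are hypothetical.

```python
from typing import List, Optional


def assemble_program(instructions: List[str], literal: Optional[str] = None) -> str:
    """Wrap emitted instructions in the program template.

    `literal` is the source string literal including its quotation marks,
    or None when the program does not print anything.
    """
    lines = [".global _start", "", ".text", "_start:", ""]
    lines += [f"  {instruction}" for instruction in instructions]
    if literal is not None:
        # The data section is only needed when the program prints a string.
        lines += ["", ".data", "msg:", f"  .ascii  {literal}", "  len =   . - msg"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(assemble_program(["mov $60, %rax", "mov $0, %rdi", "syscall"]))
```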

Testing the emitter

For now, the easiest way to test your emitter is probably to copy / paste the output of the compiler into the x86-64 playground.
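As a complement to the playground, a tiny interpreter can check the arithmetic instructions before they are pasted anywhere. The sketch below is an assumption-heavy helper, not part of the assignment. It understands only mov, push, pop, add, sub, and imul on registers and immediates, uses unbounded Python integers rather than 64-bit registers, and ignores system calls entirely. The function name simulate is hypothetical.

```python
from typing import Dict, List


def simulate(instructions: List[str]) -> int:
    """Run a small subset of AT&T-syntax x86-64 and return the final %rax."""
    regs: Dict[str, int] = {}
    stack: List[int] = []

    def read(operand: str) -> int:
        # "$5" is an immediate; anything else is treated as a register name.
        if operand.startswith("$"):
            return int(operand[1:])
        return regs[operand]

    for line in instructions:
        parts = line.replace(",", " ").split()
        op, args = parts[0], parts[1:]
        if op == "mov":
            regs[args[1]] = read(args[0])
        elif op == "push":
            stack.append(read(args[0]))
        elif op == "pop":
            regs[args[0]] = stack.pop()
        elif op == "add":
            regs[args[1]] = regs[args[1]] + read(args[0])
        elif op == "sub":
            regs[args[1]] = regs[args[1]] - read(args[0])
        elif op == "imul":
            regs[args[1]] = regs[args[1]] * read(args[0])
        else:
            raise ValueError(f"unsupported instruction: {line}")
    return regs["%rax"]


if __name__ == "__main__":
    program = [
        "mov $8, %rax", "push %rax",
        "mov $3, %rax", "push %rax",
        "pop %rbx", "pop %rax",
        "sub %rbx, %rax",
    ]
    print(simulate(program))
```

Keep in mind that on Linux the exit status visible to the shell is only the low 8 bits of the value passed to exit, so large or negative results will look different there than in this simulation.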